Scalable Problems and Memory-bounded Speedup
Abstract
In this paper three models of parallel speedup are studied. They are fixed-size speedup, fixed-time speedup and memory-bounded speedup. The latter two consider the relationship between speedup and problem scalability. Two sets of speedup formulations are derived for these three models. One set considers uneven workload allocation and communication overhead, and gives more accurate estimation. Another set considers a simplified case and provides a clear picture on the impact of the sequential portion of an application on the possible performance gain from parallel processing. The simplified fixed-size speedup is Amdahl's law. The simplified fixed-time speedup is Gustafson's scaled speedup. The simplified memory-bounded speedup contains both Amdahl's law and Gustafson's scaled speedup as special cases. This study leads to a better understanding of parallel processing.
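The relationship between the three simplified models can be illustrated with a minimal sketch. It assumes the commonly cited simplified forms: a sequential fraction `s`, `p` processors, and a work-scaling function `g(p)` that describes how the parallel workload grows when memory scales with the processor count. The names `s`, `p`, `g`, and the function names are illustrative assumptions, not notation taken from the abstract.

```python
def amdahl_speedup(s, p):
    """Simplified fixed-size speedup (Amdahl's law).

    s: sequential fraction of the work, 0 <= s <= 1
    p: number of processors
    """
    return 1.0 / (s + (1.0 - s) / p)


def gustafson_speedup(s, p):
    """Simplified fixed-time speedup (Gustafson's scaled speedup)."""
    return s + (1.0 - s) * p


def memory_bounded_speedup(s, p, g):
    """Simplified memory-bounded speedup (assumed form).

    g: hypothetical function giving the growth factor of the parallel
       workload when p processors (and their memory) are available.
    """
    scaled_parallel = (1.0 - s) * g(p)
    return (s + scaled_parallel) / (s + scaled_parallel / p)


if __name__ == "__main__":
    s, p = 0.1, 16
    # g(p) = 1: the problem size does not grow, which reduces to Amdahl's law.
    print(memory_bounded_speedup(s, p, lambda n: 1.0), amdahl_speedup(s, p))
    # g(p) = p: the work grows linearly with p, which reduces to Gustafson's law.
    print(memory_bounded_speedup(s, p, lambda n: float(n)), gustafson_speedup(s, p))
```

Under these assumptions, choosing `g(p) = 1` or `g(p) = p` recovers the two special cases the abstract mentions; a `g` that grows faster than `p` would model workloads whose computation grows faster than their memory footprint.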
Similar resources
Scalable Problems and Memory-Bounded Speedup
In this paper three models of parallel speedup are studied. They are fixed-size speedup, fixed-time speedup and memory-bounded speedup. The latter two consider the relationship between speedup and problem scalability. Two sets of speedup formulations are derived for these three models. One set considers uneven workload allocation and communication overhead, and gives more accurate estimation. A...
Scalable Computing in the Multicore Era
Multicore architecture has become the trend of high performance processors. While it is generally accepted that we have entered the multicore era, concerns exist on scaling multicore processors. Technology is available, but major vendors are hesitant in entering the multicore market with processors that have a large number of cores, citing Amdahl’s law. This is a very interesting phenomenon, wher...
Scalable Parallel Matrix Multiplication on Distributed Memory Parallel Computers
Consider any known sequential algorithm for matrix multiplication over an arbitrary ring with time complexity $O(N^\alpha)$, where $2 < \alpha \le 3$. We show that such an algorithm can be parallelized on a distributed memory parallel computer (DMPC) in $O(\log N)$ time by using $N^\alpha / \log N$ processors. Such a parallel computation is cost optimal and matches the performance of PRAM. Furthermore, our parallelization on a DM...
Scalable VLSI Design for Fast GF(p) Montgomery Inverse Computation
This paper accelerates a scalable GF(p) Montgomery inversion hardware. The hardware is made of two parts: a memory and a computing unit. We modified the original memory unit to include parallel shifting of all bits, a task previously handled by the computing unit. The new hardware is modeled, simulated, and synthesized in VHDL for several 160-bit designs showing interesting spee...
Parallel Processing Using the Silicon Graphics / Cray Origin 2000
The Origin 2000 is a high performance computing platform produced jointly by Silicon Graphics / Cray. This scalable shared memory processor (SSMP) may be configured with up to 128 processors in a single system image. The Origin is a scalable, cache coherent, non-uniform memory access (CC-NUMA), distributed shared memory (DSM) architecture based on a hypercube interconnection topology. Effective...